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Abstract 

We study how to solve sum of squares (SOS) and Lasserre's relaxations for large scale 
polynomial optimization. When interior-point type methods are used, typically only small 
or moderately large problems could be solved. This paper proposes the regularization 
type methods which would solve significantly larger problems. We first describe these 
methods for general conic semidefinite optimization, and then apply them to solve large 
scale polynomial optimization. Their efficiency is demonstrated by extensive numerical 
computations. In particular, a general dense quartic polynomial optimization with 100 
variables would be solved on a regular computer, which is almost impossible by applying 
prior existing SOS solvers. 

Key words polynomial optimization, regularization methods, semidefinite programming, 
sum of squares, Lasserre's relaxation 
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1 Introduction 

Consider the polynomial optimization 

min f(x) s.t. x £ S (1-1) 

where f{x) is a multivariate polynomial and S C M" is a semialgebraic set (defined by a 
boolean combination of polynomial equalities or inequalities) . Recently there has been much 
work on solving (jl.ip by various sum of squares (SOS) or Lasserre's relaxations. The basic 
idea is approximating nonnegative polynomials by SOS polynomials. One of their attracting 
properties is they are equivalent to certain semidefinite programming (SDP) problems. Thus 
they would be solved by SDP packages (like SDPT3 [36], SeDuMi [33], SDPA [7]). Numerical 
experiments show they give good approximations for (jl.ip . which is NP-hard. For instance, 
in minimizing a general class of polynomials, SOS relaxations work very well in finding their 
global minimum [28]. Typically SOS and Lasserre's relaxations are very successful in solving 
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([LTD . We refer to P la HZl [231 [281 EZl [MI for the work in this area. However, their 
applications are very hmited in solving big problems. For instance, to minimize a quartic 
polynomial, it is very difficult to solve its SOS relaxation on a regular computer when it has 
more than 20 variables. So far, SOS relaxations can only be solved for small or moderately 
large problems, which severely affects their practical applications. The motivation of this 
paper is proposing cheaper numerical methods for solving big polynomial optimization. 
A standard semidefinite programming problem is 

min C • X 

XdS^^ (1.2) 
s.t. A{X) =b, X^O. 

Here denotes the space of N x N real symmetric matrices, X ^ (resp. X >- 0) means X 
is positive semidefinite (resp. definite), and • denotes the standard Frobenius inner product. 
Denote by the cone of positive semidefinite matrices in . The C G 5^ and b G M'" are 
constant, and A : — > M."^ is a linear operator. The dual of ()1.2p is 

max b'^y , , 

s.t. A*iy) + Z = C, ZhO. ^ ^ 

Here A* is the adjoint of A. An X is optimal for ()1.2|) and {y, Z) is optimal for ()1.3p if the 
triple (X, y, Z) satisfies the optimality condition 

AiX) -- 

A*{y) + Z =C }. (1.4) 
X,Z '^0, xz = 

There is much work on solving SDP by interior point type methods. Most of them generate 
a sequence {{Xk,yk, Z^)} converging to an optimal solution. At each step, a search direction 
{AX, Ay, AZ) need to be computed. To compute Ay, typically an m x m linear system 
need to be solved. To compute AX and AZ, two linear matrix equations need to be solved. 
Typically the cost for computing Ay is 0{m'^). When m = 0{N), the cost for computing 
Ay is 0{N'^). In this case, solving SDP is not very expensive if N is not big (like less than 
1,000). However, when m = 0{N'^), the cost for computing Ay would be 0{N^), which is 
very expensive even for moderately large N (like 500). In this case, the cost for computing 
Ay is very big. It requires the storage of a matrix of dimension mx m and 0{'m^) arithmetic 
operations. We refer to [351 [33 [SS] for numerical methods solving SDP. 

Unfortunately, SDP problems arising from polynomial optimization belong to the latter 
case that m = 0{N'^), which is why the SDP packages based on interior point methods 
have difficulty in solving big SOS relaxations (like degree 4 with 100 variables). Now let us 
illustrate this. Let p[x) be a polynomial of degree 2d. Then p{x) is SOS if and only if there 
exists a matrix X G [23 such that p{x) = [x]^X[x]d, where [x]d denotes the column 
vector of all monomials up to degree d in graded alphabetical ordering. Note the length of 
[x]fi is N = ("'^'^) . If we write p{x) as 

Pix)= Yl PaX'^' ■ ■ ■ X^" , 
aeN":|«|<2(i 
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n= 


10 


20 


30 


40 


50 


2d = 4 


(66, 1001) 


(231, 10626) 


(496, 46376) 


(861, 135751) 


(1326, 316251) 


n= 


60 


70 


80 


90 


100 


2d=4 


(1891, 635376) 


(2556, 1150626) 


(3321, 1929501) 


(4186, 3049501) 


(5151, 4598126) 


n= 


10 


15 


20 


25 


30 


2d = 6 


(286, 8008) 


(816, 54264) 


(1771, 230230) 


(3276, 736281) 


(5456, 1947792) 


n= 


5 


10 


15 


20 


25 


2d = 8 


(126, 1287) 


(1001, 43758) 


(3876, 490314) 


(10626,3108105) 


(23751,13884156) 


n= 


5 


8 


9 


10 


15 


2d = 10 


(252, 3003) 


(1287,43758) 


(2002, 92378) 


( 3003, 184756) 


(15504,3268760) 



Table 1: A list of sizes of SDP (jl.Sp . In each pair (N,m), N is the length of matrix and m is 
the number of equality constraints. 



then p{x) being SOS is equivalent to the existence of X satisfying 

Aa»X = Pa Va G N" : |q| < 2(i, 

X y 0. ^^-^^ 

Here Aa are certain constant symmetric matrices. The number of equalities is m = ("'^J'^)- 
For any fixed d, m = 0{N'^) = 0{'n?'^). The size of SDP (jl.Sp is huge for moderately large n 
and d. Table [T] lists the size of SDP (jl.5p for some typical values of {n,2d). As we can see, 
when interior point methods are applied to solve (jl.Sp . at each step we need to solve a linear 
system of m variables and two matrix equations. To compute Ay, we need to store an m x m 
matrix and do 0{n^'^) arithmetic operations. This is very expensive for even moderately 
large n and d, and hence severely limits the applicability of SOS relaxations. On a regular 
computer, to solve a quartic polynomial optimization, it is quite difficult to apply interior 
point methods when there are more than 20 variables. 

Recently there has been much work on designing efficient numerical methods on solv- 
ing large scale SDPs. The so-called regularization methods are specially designed for solving 
SDPs whose number of equality constraints m is significantly bigger than the matrix length 
N. We refer to [191 [30l BO] for the work in this area. Their numerical experiments show 
that these methods are practical and efficient in solving large scale SDPs. Currently, these 
methods are only designed for standard SDPs of form (jl.2p or its dual ()1.3p . However, the 
SDPs arising from polynomial optimization are usually in the form of conic semidefinite op- 
timization, that is, they have block structures. Therefore, we first propose the regularization 
methods for solving general conic semidefinite optimization, and then apply them to solve 
polynomial optimization. Our numerical experiments show that these new methods could 
solve significantly bigger polynomial optimization problems efficiently. 

This paper is organized as follows. Section [2] describes the regularization methods for solv- 
ing general conic semidefinite optimization. Section [3] discusses how to solve unconstrained 
polynomial optimization. Section [4] discusses how to solve homogeneous polynomial opti- 
mization. Section [5] discusses how to solve constrained polynomial optimization. Section [6] 
concludes this paper with some discussions. 

Notations. The symbol N (resp., M) denotes the set of nonnegative integers (resp., real 
numbers). For any t € M, [t] (resp., [t\) denotes the smallest integer not smaller (resp., 
the largest integer not bigger) than t. For x S ffi"', Xi denotes the i-th component of x. 
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that is, X = (xi, . . . ,Xn)- The S^~^ denotes the n — 1 dimensional unit sphere {x S : 
+ • • • + = 1}. For a € N", denote \a\ = ai + • • • + a„. The symbol N<fc denotes the 
set {a G N" : \a\ < k}, and Nfc denotes {q € N*^ : |q| = k}. For each i, Cj denotes the i-th 
standard unit vector. The 1 denotes the vector of all ones. For x S M" and a E N^, x° 
denotes x"^ • • -x^". The [x]d denotes the vector of all monomials having degrees at most d 
in graded alphabetical ordering, that is, 

Nd=[l ^1 ■■■ x\ X1X2 xf xf~"^X2 X^], 

and [x'^] denotes the homogeneous part of [x],^ having degree d, that is. 

For a polynomial p(x), supp(p) denotes the support of p(x), i.e., the set of a G N" such that 
the monomial x" appears in p{x). For a finite set 5, IS"! denotes its cardinality. For a matrix 
A, denotes its transpose. The In denotes the N x N identity matrix. For any vector 
= V u^u denotes the standard Euclidean norm. 



2 Regular izat ion methods for conic semidefinite optimization 

This section describes the regularization methods for solving conic semidefinite optimization 
problems having block diagonal structures. They are natural generalizations of the regular- 
izatioin methods introduced in [19^ \30\ [40] for solving standard SDP problems. 



2.1 Conic semidefinite optimization 

Let /C be a cross product of semidefnite cones 

/C = 5^1 X ••• X 5^^ 

It belongs to the space M = S^^ x • • • x 5^^. Each X e M is a. tuple X = {Xi,...,Xi) 
with every Xi E . So X could be thought of as a symmetric block diagonal matrix. Thus 
X G /C if and only if its every block Xi ^ 0. The notation X ^/c (resp. X 0) means 
every block of X is positive semidefinite (resp. definite). For X = {Xi, . . . ,Xi) e Ai and 
Y = (Yi, . . . , ) G M, define their inner product as X = Xi •Yi + ■ ■ ■ + Xi»Yi. Denote 
by II ■ II the norm in Ai induced by this inner product. Note /C is a self-dual cone, that is, 

IC* = {Y e M -.Y • X >0 VXg/C} = /C. 

For a symmetric matrix W, denote by (resp. (W)-) the projection of W into the 

positive (resp. negative) semidefinite cone, that is, if W has spectral decomposition 

W XiUiuJ + ^ XiUiuJ , 

A,>0 Ai<0 

then (VF)+ and iW)- are defined as 

\i>0 Xi<0 
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For X = {Xi, . . . , Xi) E A^, its projections into /C and —IC are given by 

(Xk = ((^i)+, . . . , (Xeh), {X)^^ = . . . , 

The projection function is Lipschitz continuous, because for any X,Y ^KL 

\\X^ - YK.f = \\X - Yf + 2{X^ . Y^K. + Yk • X-ic) 
The projection norm function ||(-)a:|| is convex. To see this, note 



x^K.-y~Kf < \\x-Yf. 



min \\X + F\\. 
fg/c 



Then, for any two X,Y £ A4, it holds 



X + Y 



K. 



< 



< 



X + Y + 



X - {X).K. Y-{Y) 



+ 



-K 



+ 



Y - (y)_ 



K. 
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A general conic semidefinite optimization problem is 

min C • X 

s.t. A{X) = b,Xe}C. 



Here C e M, b e 



and A : M 



is a linear operator. The dual of ()2.ip is 



max h'^y 
s.t. A*{y) + Z = C, Z e)C. 



If A{X) is a summation like 



where every Ai : S 



Ni 



AiX)=Ai{Xi) + ---+Ae{Xe), 
is a linear operator, then its adjoint A*{y) has the form 
A*iy) = iAliy),...,AKy)). 



(2.1) 



(2.2) 



2.2 Regularization methods 

There are two typical regularizations for a standard SDP: Moreau-Yosida regularization for 
the primal (|1.2p and Augmented Lagrangian regularization for the dual (jl.3p . They would 
be naturally generalized to conic semidefinite optimization (j2.ip and its dual (|2.2p . The 
Moreau-Yosida regularization for (j2.ip is 



(2.3) 



Obviously (j2.3p is equivalent to (j2.ip . because for each fixed feasible X G the optimal 
y G in (|2.3p is equal to X. The Augmented Lagrangian regularization for (j2.2p is 



min C«X + Tf ||X-y||2 
s.t. A{X) = b, X elC. 



max 6^y- (Z + ^*(y) - C7) .y- f ||Z + ^*(y) 

s.t. Z eJC. 



C\\ 



(2.4) 
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When K, = is a single semidefinite cone, it was shown (see Section 2 of [19]) that for every 
fixed Y, ()2.4p is the dual of the following optimization 

min C • X + —\\X - Yf - y'^{A{X) -b)-Z»X. 

By fixing y G and optimizing over Z ^ 0, Malick, Povh, Rendl, and Wiegele [19] further 
showed that (j2.4p can be reduced to 

max b^y-^\\(A*{y)-C + Y/a)^f + l\\Yr. (2.5) 



Proposition 2.1. Assume there exists X such that A{X) = b. The following state- 
ments are equivalent: 

(i) {X,y,Z) is optimal for /fO)-(f^. 

(a) {X,y,Z) satisfies 

A{X) = b,X = a{Z + A*{y) -C) + Y, 
X £ IC, Z £ IC, X • Z = 0. 

(Hi) {X,y,Z) satisfies 

X = a{Y/a + A*{y)-C)!c, 

Z = -{Y/a + A*{y)-C).K. (2.6) 
AA*y + A{Z -C) = {b- A{Y))/a. 

The proof of Proposition 2.6 in p!9] would be naturally generalized from a single semidefinite 
cone to the cone /C of block diagonal positive semidefinite matrices. So we omit the proof 
of Proposition 12.11 

In ()2.6p . AA* is a symmetric matrix defined in the way that 

{AA*y)'^z = A*{y)»A*{z) Vy,zGM™. 

The optimality condition (j2.6p leads to the following algorithm for solving (|2.ip : 



Algorithm 2.2. Choose parameters a > 0, e > 0, y G 5^, set Z = 0. Do the loop: 

While {\\b - A{X)\\ > e or \\C - Z - A*{y)\\ > e) 

Solve AA*y = A{C - Z) + {b - A{Y))/u for y. 
Compute the projections 

X = a(Y/a + A*{y) - C)k, Z = -{Y/a + A*{y) - C).^. 

Set Y := X and update a. 

When K. = is a single semidefinite cone, Algorithm 12.21 reduces to the famous Bound- 
ary Point Method (BPM), which was originally proposed by Povh, Rendl, and Wiegele [3D] 
(also see [19j ) for solving big SDPs. This method was proven surprisingly efficient in some 
applications, as illustrated in [191 130j . Algorithm 12.21 can be thought of as a natural gener- 
alization of BPM to general conic semidefinite optimization (j2.ip . It is a specification of the 
following general regularization algorithm: 
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Algorithm 2.3. Choose Y £ , y £ M'", and e^"-, e""*, set Z = and do the loop 
Repeat until ||Z + - C|| < e""** {outer loop): 

Repeat until ||6 — ^(X)|| < e*" {inner loop): 
Compute the projections 

X = a{Y/a + A*{y) - C)^, Z = -{Y/a + A*{y) - C).^- 
Set g = h-A{X). 

Update y ^ y + rWg with appropriate r and W . 

end 

Set Y := X and update a. 

end 

Algorithm 12.31 is also a natural generalization of the regularization framework proposed 
by Malick, Povh, Rendl, and Wiegele [19^ Algorithm 4.3] for solving large scale SDP when 
/C = is a single semidefinite cone. In that case, if W is chosen to be {AA*)~^ and r = l/cj, 
Algorithm 12.31 reduces to Algorithm 12.21 as shown in [191 Prop. 4.4]. Its convergence was 
discussed in [191 Theorem 4.4]. 

For solving large scale SOS relaxations, Algorithm 12.21 might converge fast at the be- 
ginning, but generally has slow convergence when we get close to optimal solutions. This 
is because it is basically a gradient type method. To get faster convergence, we need bet- 
ter solvers for the inner loop of Algorithm 12.31 Typically, Newton type methods have good 
properties like local superlinear or quadratic convergence. This leads to another kind of regu- 
larization methods called Newton-CG Augmented Lagrangian, which was proposed by Zhao, 
Sun and Toh [30]. It gives surprisingly good numerical performance in solving big standard 
SDP problems. In the rest of this section, we generalize this important method to solve 
general conic semidefinite optimization (j2.ip . 

Denote by ipcr{Y,y) the objective in ()2.5p . When /C = , Lpa{Y,y) is differentiable [191 
Proposition 3.2] and 

Vyip^{Y, y) = b- aA(^{A*{y) -C + y/a)yc) • 

The above is also true when /C is a cross product of semidefnite cones. The inner loop of 
Algorithm 12.31 is essentially solving the maximization problem 

max ip^{Yk,y). (2.7) 

Since (fa is concave in y, a point y is a maximizer of (|2.7p if and only if 

Vyipa{Yk,y) = 0. 

The function ip^{Y,y) is not twice differentiable, so the standard Newton's method is not 
applicable. However, the function ip„{Y^y) is semismooth, and semismooth Newton's method 
could be applied to get local superliner or quadratic convergence, as pointed out by Zhao, 
Sun and Toh [40 [ Section 3.2]. So we need to compute the generalized Hessian of (/^o-. If 

A*{y) = {A\{y), A}{y)), C = (Ci, . . . , Q), 

then for Y = {Yi, . . . ,Ye) £ M 

A*{y) -C + Y/a= (^AUy) - Ci + Yi/a, A}{y) -C, + Y,/a) . 
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So we get MY,y) = b^y + ^\\Y\\^ - ELi ^l^\y,y), where 



^i!'\Y,y) = ^\\{Al{y)-a + Y,/a)+f. (2i 



This imphes the formula 

vi^,{Y,y) = -Y,^y^\y.y)- 



k=l 



A numerical method for evaluating Vyip'^\Y,y) was given in [401 Section 3.2]. This can be 



y 

done by computing eigenvalue decompositions. Let 

AUy)-Ck + Yk/a = QkA^''^Ql, 

where Qk G R^^xNk jg orthogonal and A^^'^ = diag(A[^ Aj^^^) is diagonal with A^'^'' > 
^2*^'' > • • • > A^^^ . Denote index sets 

rf ^ = {l<i<Nk: Af ^ > 0}, rL^^ = {l<i<Nk: Af ^ < 0}. 
Then define a symmetric matrix Q^^^ G ]^^feX^fc a,s follows: 



1 iff,jerf, 

ifierL'^ierf , 

otherwise. 



It was shown [301 Section 3.2] that 

Vyj'\Y,y) . z = a^fc(gfc(0W o iQlAUz)Qk))Ql 

where o denotes Hadamard product. Now we define a matrix G M™^"^ satisfying 

£ 



It was shown [A0\ Prop. 3.2] that we always have ^yy — ^- Furthermore, ^yy ^ if some 
nondegeneracy condition holds. In either case, an approximate semismooth Newton direction 
dnew for (j2.7p can be determined by the linear system 



Cy y + e • InJ dnew = ^yfa- (2.9) 

Here e > is a tiny number ensuring the positive definiteness of the above linear system. 

When m is huge, it is usually not practical to solve (|2.9p by direct methods like Cholesky 
factorization. To avoid this difficulty, Zhao, Sun and Toh [ID] proposed applying conjugate 
gradient (CG) iterations to solve (j2.9p approximately. To fasten the convergence of CG, 
a good preconditioner for (j2.9p is (.4^*)"^, as mentioned in [30]. In SOS relaxations for 
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unconstrained and homogeneous polynomial optimization, there is a nice property that the 
resulting {AA*)~^ has explicit neat formulae, as will be shown in Sections [3] and [H So 
we use (AA*)"^ as a preconditioner in this situation. However, in Lasserre's relaxation for 
constrained polynomial optimization, typically there is no explicit expression for {AA*)~^, 
and usually computing {AA*)~^ is quite difficult. In this case, we just use (diag(.A.A*)) 
as a preconditioner, as suggested in [40]. In either case, we set a cap K on the maximum 
number of CG iterations to insure solving (j2.9p do not take too much time. A standard 
preconditioned CG algorithm (like Algorithm 6.12 in [4]) can be applied to solve (12. 9p . The 
numerical experiments in [1^ show that this method is surprisingly powerful in solving some 
big standard SDPs when /C = is a single semidefinite cone. 

Now we present the Newton-CG Augmented Lagrangian regularization method for solving 
general conic semidefinite optimization (j2.ip - (j2.2p . 

Algorithm 2.4. Choose e*'^,e°"* G (0, 1), e > 0, 5 G (0, 1), p > 1, crmi,^,K £ N. 
Initialization: Choose Xq, Zq G , yo G M"^, uo and set A; = 0. 
While {\\Zk + A*iy)k - C\\ < e°"*) {outer loop): 
Set Yfc := Xk. 
Set j = and yuj = yk- 
While {\\yy^Pak{yk,yk,j)\\ < e'"-) {inner loop): 
Set Qj = Vy(pa^^{Yk,yk,j)- 

Compute dnew from (j2.9p by applying preconditioned CG at most K steps. 
Find the smallest integer a > such that 

V^CTfe {Yk, VkJ + (5" • dnew) > V^CTfe {Yk,yk,j) + ■ gjdnew- (2.10) 

Set yfcj+i := Ukj + ^" ■ dnew- 

Compute the projections into the cone fC: 

Xk = a{Yk/a + A*{ykj+i) - C)^, Zk = -{Yk/a + A*{yk,j+i) - 

Seti :=J + 1. 
end {inner loop) 
Set yk+i = ykj 

If (Tfc < fJmax, set (Tfc+i = pfjfc. 

Set k:=k + l. 
end {outer loop) 

When /C = is a single semidefinite cone, the convergence of Algorithm 12.41 has been 
discussed in [40 1 Theorems 3.5, 4.1,4.2]. These results could be generalized naturally to the 
cone /C of block diagonal positive semidefinite matrices. The specifics about the convergence 
are beyond the scope of this paper. We refer to fi9\ [STj [32 1 [30]. 

Algorithm 12.21 and 12.41 are basically specifications of Algorithm 12.31 Generally Algo- 
rithm 12.21 has slower convergence than Algorithm 12.41 does, because it is a gradient type 
method while the latter is a Newton type method. They have similar performance when 
computing eigenvalue decompositions is not expensive. However, when eigenvalue decompo- 
sitions are expensive to compute. Algorithm 12.21 usually takes more time because it requires 
computing more eigenvalue decompositions. In large scale polynomial optimization, the ma- 
trices often have big dimension, and hence Algorithm [23] is more suitable. So in later sections, 
the numerical experiments are mainly based on using Algorithm 12.41 



9 



3 Unconstrained polynomial optimization 



Consider unconstrained optimization problem 

/-„:=min f{x) (3.1) 

where f{x) is a multivariate polynomial of degree 2d. Here f^^^ denotes the global minimum 
of /(x) over R". Problem (j3.ip is NP-hard when deg(/) > 4, as shown in [21]. There has 
much recent work on solving 1^ via SOS relaxations, like HSl 111 [231 [271 [28]. The 
standard SOS relaxation for (13.11) is 



fsos ■= max 7 

s.t. /(a;) - 7 is SOS. ^ ^ 

Obviously the above optimal value /"^^ satisfies the relation /"^^ < Though it is 

possible that /^'^^ < it was observed in [281 EZ] that ()3.ip works very well in practice. 

Typically (j3.2p gives a reasonably well approximation for (j3.ip . which is good enough in 
many applications. SOS relaxations are usually very expensive to solve when interior-point 
methods are used. In this section, we show how to apply regularization methods described 
in Section [2] to solve ()3.2p . We begin with the SDP formulation of (13. 2p . 

Note f{x) — 7 is SOS if and only if there exists X ^ ^ ) satisfying 

f{x) - 7 = [x\lX[x\d = X . {[xUxf^), XtO. 

Here {^'jj^'^) is the total number of monomials in x of degrees up to d. Define 0/1 constant 
symmetric matrices C and in the way that 

[xUx]^ = C+ Yl (3.3) 

oGN":0<|o|<2(i 

Denote U^^ = {a G N" : < \a\ < 2d}. If f{x) = /o + E^gU" /"^^^^ t^^en 7 is feasible for 
()3.2p if and only if X satisfies 

C.X + 7 = /o, 

X ^ 0. 

Let 6 = ifa)aei]^^ be a vector of dimension m = ("JJ'^) — 1. Define A and /C as 

(n+d\ 

A{X) = {Aa»X)aev^^, }C = Sl''. 

Then SOS relaxation (|3.2p is equivalent to 

/- :=min CX 

s.t. A{X) = b, X €]C. ^ ' ' 



The dual of the above is 



max b^y , . 

s.t. A*{y) + Z = C,Z elC. ^ ' 



Here the adjoint A*{y) = ZqgU^^ Va^a- 
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Theorem 3.1. For notations defined above, it holds f^^^ = —f^^p + fo < fmin- V SOS 
relaxation 13.^) is exact, then f^^^ = -f^^p + fo- 

Proof. As we have seen in the above, optimization problems (j3.2|) and (j3.4p are equivalent. 
Their objectives satisfy the relation 

Maximizing 7 in ()3.2p is equivalent to minimizing C • X in ()3.4p . The feasible sets of (13. 2p 
and (|3.4p are the same. Hence we have f^^^ = — /]rfp + /o- Furthermore, /"^^ < fmin is obvious 
because (j3.2p is a relaxation of (|3.ip . 

If SOS relaxation is exact, i.e., f:;,% = fZn^ then obviously fZn = -fsdp + fo- ° 

Now we turn to finding {AA*)~^. Sine Aa defined in (13. 3p are orthogonal to each other, 
the matrix AA* is diagonal. So we get the formula 

(AA*)-'^ = diag(a), where a = ( (l^^^l)"^ : a E U^^). (3.6) 

Here 1 is the vector of all ones. 

Algorithm [23] can be apphed to solve the SDP (133I)-(I33|). We use (AA*)"'^ via the 
formula (|3.6p as a preconditioner. Suppose {X* ,y* , Z*) is optimal for (j3.4p - (|3.5p . Then 
~fsdp + /o is a lower bound for the minimum f^^^. Furthermore, if Z* has rank one, then 
fmin ~ fsos ^^^^ a global miuimizer for (|3.ip can be obtained easily. This can be illustrated 
as follows. When rank(Z*) = 1, the constraint in (j3.5p implies Z* = [x*]d[x*]'^ for some 
X* E M", and hence y* = — [x*]2d- Then, for any x E M", 

-fix*) = -fo + b'^y* >-fo+ Yl = -•^(^)- 

The inequality above follows the optimality of y* and the fact that Z = [x]d[x]'^ is always 
feasible for (j3.5p . So x* is a global minimizer. 

When rank(Z*) > 1, several global minimizers for (j3.ip could be obtained if a so called fiat 
extension condition (FEC) holds. We refer to Curto and Fialkow [3] for FEC, and Henrion 
and Lasserre [8] for how to extract global minimizers from Z* . Typically, FEC fails if there 
are a lot of global minimizers or the SOS relaxation is not exact. 



3.1 Numerical experiments 

Now we show some numerical examples of applying Algorithm 12.41 to solve the SDP (j3.4p - 
(j3.5p which is equivalent to SOS relaxation (j3.2p . The computation is implemented on a 
Linux Desktop of 3.8GB memory and 2. 8GHz CPU frequency. Algorithm 12.41 is implemented 
in Matlab, and its parameters are: e*" = e""* = 10~^, p = 5, ctq = 1, cTmax = 10^, 5 = ^, 
K = 500. The inverse {AA*)~^ via the formula (|3.6p is used as a preconditioner. Its inner 
loop is terminated if it does not converge within 25 steps, and the outer loop is terminated if 
it does not converge within 20 steps. If the optimal Z* satisfies FEC, we extract one or several 
global minimizers x* by using the method in [8]; otherwise, we just set x* = Z*[2 : n + \, 1). 
In either case, the relative error of x* is measured as 

max{l, \f{x*)\} 
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Since f^^^ < < fix*), err = if and only if x* is a global minimizer. So the smaller err 
is, the better the obtained solution is. 

Example 3.2. Minimize the quartic polynomial {x'^)^Ax'^ — x^Bx. Here denotes the 
component- wise square of x, and A, B are symmetric matrices given as 



A 



n 
n — 1 
n-2 

2 
1 



n — 1 n — 2 
n 
n 


2 





n-2 



2 



n 

n — 1 



1 
2 
3 

n — 1 

n 



,B 



1 1 

1 1 

1 

1 

1 1 



1 1 
1 



1 1 
1 1 



When n = 50, the size of the resulting SDP (H^ll-dHS]) is {N,m) = (1326,316250). Solving 
()3.4p - (j3.5p takes about 10 hours. The computed lower bound fg^^ = —1.2919 (only four 
decimal digits are shown). The optimal Z* has rank two and satisfies FEC. So we get two 
solutions from Z* (only four decimal digits are shown): 

±(0.0000, 0.1504, 0.1480, 0.1456, 0.1432, . . . , 0.0584, 0.0572, 0.0560, 0.0548, 0.2188). 

The relative error is about 10~^. So they are global minimizers up to a tiny numerical error. 

Example 3.3. Solve the quartic polynomial optimization 

1 



min (xi + 



When n = 50, the resulting SDP has size {N,m) = (1326,316250). Solving it 

takes about 1.5 hours. We get fg^^ = —0.1246 (only four decimal digits are shown) and 
rank(Z*) = 1. The solution extracted (only four decimal digits are shown) is 



-(0.1127, 0.1127, 0.1126, 0.1125, 0.1124, . . . , 0.1086, 0.1085, 0.1085, 0.1082, 0.1128). 
Its relative error is about 2 • lO-'^. So it is a global minimizer up to a small numerical error. 
Example 3.4. Minimize the following least squares polynomial 




,i=l 

where xq = Xn+i = 0. For n = 16, the resulting SDP (j3.4|) - (|3.5p has size {N,m) = 
(969,74612). Solving (I331)-(l33]) takes about 12 hours. The computed lower bound /J^^'^, = 
7.5586 (only four decimal digits are shown). The optimal Z* has rank two and FEC holds. 
Two extracted solutions (only four decimal digits are shown) are 

(0.0039, 0.6285, 0.5370, 0.0259, -0.4324, -0.4266, 0.1540, -0.5108, 0.2172, -0.4029 

0.4400, -0.4307, 0.0230, 0.5378, 0.6285, 0.0039), 
(0.0039, 0.6285, 0.5378, 0.0230, -0.4307, 0.4400, -0.4029, 0.2172 - 0.5108, 0.1540 

-0.4266, -0.4324, 0.0259, 0.5370, 0.6285, 0.0039). 



The relative error is around 10 ^. So they are global minimizers up to a tiny numerical error. 
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Example 3.5. Solve the least square problem 



i=l \ j=l / \ j=i J 

For n = 15, the resulting SDP (l3aD-(l33]) has size {N,m) = (816,54263). Solving (f3:ill - (f33]) 
takes about 4 hours. The computed lower bound /"^^ = 2.1011 (only four decimal digits are 
shown). Since rank(Z*) = 1, we get a solution (only four decimal digits are shown) 

( 0.5233, 0.2429, 0.0802, 0.0011, -0.0003, 0.0001, -0.0000, 0.0000, -0.0000, 
0.0001, -0.0005, 0.0020, -0.0088, 0.0361, 0.6011). 

Its relative error is around 10~^. So it is a global minimizer up to a tiny numerical error. 

Example 3.6 (Random polynomials). We test the performance of Algorithm 12.41 in solving 
SOS relaxations for minimizing randomly generated polynomials. To insure the polynomials 
have finite global minimum, we generate f{x) randomly as follows 

f{x) = f[xU-i + [x'']^F'^F[x% 

where / € M*^ 2d-i ) is a random vector of Gaussian distribution and F S ^ ' ^ <i ' 
is a random matrix of Gaussian distribution. Here [x'^] denotes the vector of monomials 
of degree equaling d. The generated f{x) almost always has a finite global minimum. The 
computational results are listed in Tables [2]l5l The first column is the number of variables; the 
second column is the size of the resulting SDP (j3.4p - (j3.5p : the third column is the number 
of instances generated; the fourth column is the maximum, median, and minimum time 
consumed by the computer; and the last column is the maximum, median, and minimum 
relative errors of obtained solutions. 



n 


(N,m) 


# inst 


time (min, med, max) 


err (min, med, max) 


20 


(231, 10625) 


20 


0:00:45 


0:01:07 


0:01:59 


(5e-9, 7e-8, 2e-8) 


30 


(496, 46375) 


20 


0:10:45 


0:11:44 


0:12:43 


(6e-9, 5e-8, le-8) 


40 


(861, 135750) 


10 


0:28:42 


1:02:36 


1:22:24 


(3e-9, 3e-8, 9e-8) 


50 


(1326, 316250) 


5 


2:04:43 


3:33:31 


4:15:38 


(6e-8, 4e-7, 9e-7) 


60 


(1891, 635375) 


5 


5:17:42 


8:03:04 


15:23:23 


(7e-8, le-7, 5e-7) 


70 


(2556, 1150625) 


5 


10:25:07 


13:06:38 


20:44:31 


(2e-9, 7e-8, 3e-8) 


80 


(3321, 1929500) 


3 


30:09:34 


32:17:43 


33:30:12 


(6e-8, 8e-8, le-7) 


90 


(4186, 3049500) 


1 


63:55:41 


63:55:41 


63:55:41 


(4e-7, 4e-7, 4e-7) 


100 


(5151, 4598125) 


1 


199:49:33 


199:49:33 


199:49:33 


(6e-7, 6e-7, 6e-7) 



Table 2: Computational results for random (j3.ip of degree 4 



When f{x) has degree 4 (d = 2), the computational results are in Table O As we can see, 
SOS relaxation (j3.2p is solved quite well. For n = 20 ~ 30, solving (j3.4p - (j3.5p takes a couple 
of minutes; for n = 40 ~ 50, it takes a couple of hours; for n = 60 ~ 80, it takes about five 
to thirty hours; for n = 90, it takes about sixty hours; for n = 100, it takes about one week. 
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n 


(N,m) 


# inst 


time (min, med, max) 


err (min, med, max) 


10 


(286, 8007) 


20 


0:01:45 


0:00:54 


0:03:16 


(9e-10, 4e-8, 6e-7) 


15 


(816, 54263) 


10 


0:39:05 


1:07:42 


1:33:51 


(3e-8, 6e-7, 2e-6) 


20 


(1771, 230229) 


3 


11:47:23 


21:35:17 


25:36:08 


(9e-8, 7e-8, le-6) 


25 


(3276, 736280) 


1 


104:41:36 


104:41:36 


104:41:36 


(6,e-7, 6e-7, 6e-7) 


Table 3: Computational results for random (j3.ip of degree 6 


n 


(N,m) 


# inst 


time (min, med, max) 


err (min, med, max) 


8 


(495, 12869) 


20 


0:03:44 


0:07:37 


0:19:42 


(le-8, 3e-7, 8e-6) 


10 


(1001, 43757) 


10 


1:14:48 


1:40:58 


2:20:22 


(9e-9, 3e-8, 4e-6) 


12 


(1820, 125969) 


3 


13:04:06 


13:11:35 


13:34:40 


(8e-8, le-7, 6e-7) 


15 


(3876, 490313) 


1 


217:46:12 


217:46:12 


217:46:12 


(9e-8, 9e-8, 9e-8) 



Table 4: Computational results for random ()3.1|) of degree 8 



When f{x) has degree 6 (d = 3), the computational results are in Table El For n = 15, 
solving (j3.4p - (j3.5p takes about forty to ninety minutes; for n = 20, it takes about ten to 
twenty five hours; for n = 25, it takes about one hundred hours. 

When /(x) has degree 8 {d = 4), the computational results are in Tabled For n = 10, 
solving p.4p - (l3.5p takes about one or two hours; for n = 12, it takes about thirteen hours; 
for n = 25, it takes about two hundred hours. 

When /(x) has degree 10 {d = 5), the computational results are in Table O For n = 6, 
solving (j3.4p - (j3.5p takes a couple of minutes; for n = 8, it takes about a couple of hours; for 
n = 9, it takes about ten to twenty hours; for n = 10, it takes about seventy hours. 

For polynomials generated in this example, SOS relaxation (j3.2p is very successful in 
solving p.ip globally. For all the generated instances, the optimal Z*s have rank one, and 
highly accurate global optimal solutions with tiny errors are found. This is consistent with 
the computational results shown in [28]. The computational results demonstrate that Algo- 
rithm [23] is very efficient in solving SOS relaxations for large scale polynomial optimization. 
Significantly bigger problems would be solved very well. A quartic polynomial optimization 
with 100 variables would be solved on a regular computer. This is almost impossible by 
applying prior existing SOS solvers using interior point type methods. 



n 


(N,m) 


# inst 


time (min, med, max) 


err (min, med, max) 


6 


(462, 8007) 


20 


0:03:18 


0:05:19 


0:06:22 


(2e-8, le-7, 2e-7) 


8 


(1287, 43757) 


10 


1:51:38 


3:47:52 


6:53:22 


(5e-8, 3e-7, 4e-6) 


9 


(2002, 92377) 


3 


9:18:17 


12:09:07 


20:52:36 


(2e-8, 2e-7, 3e-6) 


10 


(3003, 184755) 


1 


72:40:36 


72:40:36 


72:40:36 


(9e-6, 9e-6, 9e-6) 



Table 5: Computational results for random (j3.ip of degree 10 
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4 Homogeneous polynomial optimization 

A frequently encountered polynomial optimization is 

(4.1) 

S.t. \\x\\2 = 1 

where f{x) is a form (homogeneous polynomial). Assume its degree deg(/) = 2d is even, i.e., 
f{x) is an even form. The case that deg(/) is odd can be transformed to minimizing an even 
form over the unit sphere E>'^~^ = {x € M" : ||x||2 = 1}, as will be shown later. In ()4.ip . /^^^ 
denotes the minimum objective value. Problem (|4.ip is NP-hard when d > 1. 
The standard SOS relaxation for (14.11) is 

T.^d ..ar.o (4-2) 



rhmq 

jsos ■■= max 7 

s.t. f{x) - 7 • {x' xy is SOS. 



Let [x'^] be the vector of monomials having degree d ordered lexicographically, i.e., 

[x ] = ^xf X-^ X2 X^ X3 • • • Xn — lX^ ^ x'^ ■ 

Denote = {a G N" : \a\ = d}. For each a € Na, define D^, = ^^H^- Let D = diag(Z)c«) 
be a diagonal matrix. It holds the relation 

(x^xf = D^xl'^' ■ ■ ■ x^"" = [xXd[x'^] = {[xVf) • D. (4.3) 



Define 0/1 matrices Ha such that 

[xVf= ^aXr---X^. (4.4) 

Then /(x) — 7 • (x'^x)'^ being SOS is equivalent to the existence of X satisfying 

fa--fDa/2 =Ha*X VaG2Nd, 

X hO. 

Letting a = 2dei in the above, we get 

7 = /2dei — H2dei • X. 

Denote M.^^ = f^2d\{2dei} , and set r = {ra)aea^^ as 

^iDa/2 if ae2Nd\{2(iei}, 
lo ifaEN2d\{2Na. 



(4.5) 



Define matrices C, and scalars 6q as 

C = H2dei, Aa = Ha-raH2dei, K = fa - raf2dei VoGHarf. (4.6) 
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Let b = {ba)aGM^^- Define linear operators Tl,A : ^l^'*' M^mI and a cone IC as 
Then SOS relaxation (|4.2|) is equivalent to 



The dual of (gZI) is 



s.t. A{X) = b,XelC. 

max 

s.t. A*{y) + Z = C, Z elC. 



(4.7) 



Here .4,*(y) = X^ogjjn^ ya^o is the adjoint of A. 

Theorem 4.1. For notations defined above, it holds fs^^ = —f^^^ + f2dei < f^n- If SOS 
relaxation (3^ is exact, then /^™f = -f^^'^ + f2dei- 

Its proof is the same as for Theorem 13. H and hence is omitted. A general error bound for 
()4.2p approximating (14. ip is given in |24j . 

Now we turn to finding [AA*)~^. Note the matrices defined in (|4.4p are orthogonal 
to each other. Thus 7i7i* is diagonal, and 

{nn*)-^ = diag((l'^^„l)-^ : a G M^^). 

Here 1 is the vector of all ones. Thus, we have for all a, (3 G IHl2^ 

Aa» Ai3 = {Ho, - raH2dei) • {Hp - r/3H2dei) 

= Ha»Hp- H2dei • {raHf^ + rpHa) + rarpH2dei • H2dei 

= Ha» Hp + r^rp, 

that is, AA* = TiTL* + rr^. By Sherman-Morrison formula, we have 

/ . 1 1 {nn*y^rr'^{nn*r^ 

{AA*)"^ = {nn*y^ - — ^ , / . 

Setting vectors 

h ={{l^Hair' : a G M^J, p = (l + r^diag(h)r) diag(h)r, 
we get a neat formula 

{AA*y^ = diag(h) - pp^. (4.9) 

Algorithm [23] can be applied to solve (|4.7p - (|4.8p . We use {AA*)~^ via the formula (|4.9p 
as a preconditioner. Let {X* ,y* , Z*) be optimal for (I4.7p - ()4.8p . Then —f^^^ + f2dei is a 

lower bound for the minimum /^™f . We would also get global minimizers from Z* when 
FEC holds. Note that and imply 

Aa»D = VaGM^rf and Z* • D = H2dej • D = I. (4.10) 
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So Z*{dei,dei) {Z is indexed by integer vectors in N") can not vanish for every i, because 
otherwise we would get Z* = contradicting ()4.10p . Up to a permutation of x, we can assume 
Z*{dei,dei) / 0, and normahze Z* as Z* = Z* /Z*{dei,dei). Then Z* would be thought of 
as a moment matrix of order d in n — 1 variables (see [3]). If Z* satisfies FEC, using the 
method in [8], we can get v^^\ . . . , v^^'^ G M"~^ such that 

r* = xM%[v^'Yd + ■ ■ ■ + K[v^%[v^'r, 



for some scalars Aj > 0. Setting x^*) = (l + Hf^*-*!!!) 



1 



G we get 



z* = i^,[{x('yf][{xW)x + . . . + ^,,[(x(^))'^][(xM)r 

for some scalars Vi > 0. Then the relations (j4.3p and (j4.10p imply i^i + • • • + i/j. = 1. Since 
every Z^*) = is feasible for ()4.8p . the optimality of Z* implies every x^*) is a 

global minimizer for (j4.ip . 

In particular, if rank(Z*) = 1, then Z* satisfies FEC automatically. When FEC fails, it 
is not clear how to extract a minimizer from Z*. This happens usually because there are a 
lot of global minimizers or SOS relaxation (j4.2p is not exact {f^^^ < f^^)- 

4.1 Numerical experiments 

Now we present some numerical examples of applying Algorithm 12.41 to solve the SDP (j4.7p - 
()4.8p which is equivalent to SOS relaxation (|4.2p . As in Section 13. H the computation is 
implemented in Matlab on the same machine. The parameters and stopping criteria in 
Algorithm [23] are also the same as there. The inverse (^.4*)"^ via the formula (j4.9p is used as 
a preconditioner. If the optimal Z* satisfies FEC, we apply the preceding procedure to extract 
one or several global minimizers x*; otherwise, we just choose x* to be the normalization of 
Z*{\ : n, 1). The relative error of x* is measured as 



err 



i/(x*)-/ir^i 



max{l, |/(x*)|}' 

Since /s'^o?^ < — fi^*)'> = if and only if x* is a global minimizer of (I4.ip . 
Example 4.2. Minimize the following square free quartic form over W^~^ 

{-i - 3 + k + tjXiXjXkXi. 

l<i<j<k<e<n 

For n = 50, the resulting SDP gZll-diSl) has size (iV,m) = (1275,292824). Solving (lO) - 
(|4.8p takes about 3.5 hours. The computed lower bound fs^^ = —140.4051 (only four decimal 
digits are shown). The optimal Z* has rank two and FEC holds. There are 2 solutions (only 
four decimal digits are shown) extracted from Z*: 

(0.6393, 0.1217, 0.0827, 0.0575, 0.0401, . . . , -0.1653, -0.1619, -0.1581, -0.1541, -0.1499), 
(0.1499, 0.1541, 0.1581, 0.1619, 0.1653, . . . , -0.0401, -0.0575, -0.0827, -0.1217, -0.6393). 

The error is around 3 • 10~^. So they are global minimizers up to a tiny numerical error. 
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Example 4.3. Minimize the following tridiagonal sexitic form over S' 



l<i<n l<j<n-l 



For n = 20, the size of the resulting SDP g2D-(jMl) is iN,m) = (1540,177099). Solving 
(|4.7p - (|4.8p takes about 10 hours, and the computed lower bound fsos^ = 3.446 • 10~^ (only 
four leading digits are shown). So this form is positive definite, which is contrast to the fact 
that the associated tridiagonal matrix is not positive semidefinite. The optimal Z* has rank 
one, and the extracted solution (only four decimal digits are shown) is 

(0.1474, -0.1811, 0.2024, -0.2176, 0.2290, -0.2375, 0.2439, -0.2484, 0.2514, -0.2528, 
0.2528, -0.2514, 0.2484, -0.2439, 0.2375, -0.2290, 0.2176, -0.2024, 0.1811, -0.1474). 

Its relative error is around 8 • 10"^^. So it is a global minimizer up to a tiny numerical error. 
Example 4.4. Minimize the following sexitic form over S"~^ 

E 222,32 ,23 ,.32 
X ^ X j X ^ I X 2 X j X I X 2 j k I Xj^X j X • 

l<i<j <k<.n 

For n = 20, the resulting SDP l^-l^ has size {N,m) = (1540,177099). Solving 
p.Sp takes about 11 hours. The computed lower bound /i^oT^ = —0.3827 (only four decimal 
digits are shown). The optimal Z* has rank one, and the extracted solution (only four decimal 
digits are shown) is 

-(-0.6551, -0.6286, 0.1109, 0.1089, 0.1071, 0.1054, 0.1038, 0.1023, 0.1009, 0.0996, 
0.0983, 0.0971, 0.0960, 0.0949, 0.0939, 0.0929, 0.0920, 0.0911, 0.0902, 0.0894). 

Its relative error is around 6 • 10^^^. So it is a global minimizer up to a tiny numerical error. 

An important application of homogenous polynomial optimization (|4.ip is estimating 
stability numbers of graphs, as shown below. 

Example 4.5 (Stability number of graphs). Let G = {V,E) be a graph with \V\ = n. The 
stability number q(G) is the cardinality of the biggest stable subset(s) (their vertices are not 
connected by any edges) of V. It was shown in Motzkin and Straus [20] (also see De Klerk 
and Pasechnik [6]) that 

a{G)~^ = min x'^{A + In)x, 

where A„ is the standard simplex in M" and A is the adjacency matrix associated with G. If 
replacing every > by x|, we get 

n 

a{G)-^ = min ^^4 + 2 xfxj. (4.11) 



r 3 1 2 

xf 


"l 


1 







o" 




x'f 


^3 

4 


1 




1 
1 


1 
1 









^3 

4 


















x^ 










1 


1 




X'^ 
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n 


(N,m) 


# inst 


time (mill, med, max) 


20 


(210, 8854) 


20 


0:00:22 


0:03:04 


0:06:25 


30 


(465, 40919) 


10 


0:04:12 


0:39:58 


1:33:47 


40 


(820, 123409) 


10 


0:20:52 


1:21:01 


5:24:12 


50 


(1275, 292824) 


3 


23:50:02 


26:27:12 


27:15:46 


60 


(1830, 595664) 


1 


75:46:58 


75:46:58 


75:46:58 



Table 6: Computational results for stability number of random graphs 



This is a quartic homogeneous polynomial optimization. When a lower bound fsos^ for (j4.1ip 
„a to SOS retaafon g2l .s computed, we »u„d (/."-)-' .„ the nearest mteger which 

will be used as an estimate for a{G). 

We generate random graphs G, and solve the SDP (I4.7p - (l4.8p which is equivalent to its 
SOS relaxation by applying Algorithm l2.4[ The generation of random graphs is in the similar 
way as in Bomze and De Klerk O Section 6]. For n = 20, 30, 40, 50, 60, we generate random 
graphs G = {V,E) with \V\ = n. Select a random subset M CV with \M\ = n/2. The edges 
^ijiihj} 't- generated with probability ^. The computational results are in Table 

O The first column is the number of vertices, the second column is the size of the resulting 
SDP (j4.7p - (|4.8p . the third column is the number of generated instances, and the last column 
is the minimum, median and maximum time consumed by the computer. 

As we can see from Table [6l for n = 20, solving (j4.7p - (j4.8p takes a couple of minutes; for 
n = 30, it takes about four to ninety minutes; for n = 40, it takes about twenty minutes to 
five hours; for n = 50, it takes about twenty five hours; for n = 60, it takes about seventy five 
hours. In all the instances, we get correct stability numbers via solving the SOS relaxation 

of Km . 

Example 4.6 (Random forms). We test the performance of Algorithm 12.41 in solving SOS 
relaxation (14. 2p for random forms. Generate a form f{x) randomly as 

fix) = Yl 

aeN":\a\=2d 

where the coefficients fa are Gaussian random variables. We apply Algorithm 12.41 to solve the 
SDP (j4.7p - (|4.8p which is equivalent to the corresponding SOS relaxation (j4.2p . The random 
forms of degrees 4, 6, 8, 10 are tested. The computational results are in Tables iTlfTOl The 
meaning of their columns are the same as in Tables [2]l5j 

When f{x) has degree 4 (d = 2), the computational results are in Table [71 For n = 30, 
solving (|4.7p - (|4.8p takes a couple of minutes; for n = 40, it takes about thirty to forty minutes; 
for n = 50, it takes a few hours; for n = 60, it takes about six to nineteen hours; for n = 70, 
it takes about fifteen to thirty four hours; for n = 80, it takes about one hundred hours. In 
all the instances, the optimal Z* has rank one and FEC holds. The obtained solutions have 
tiny relative errors. 

When /(x) has degree 6 {d = 3), the computational results are in Table [8j For n = 15, 
solving ()4.7p - (l4.8p takes around seventeen to twenty six minutes; for n = 20, it takes a couple 
of hours; for n = 25, it takes about fifty hours. The optimal Z* now does not always satisfy 
FEC. When n = 10 or 15, there are some instances that their errors are relatively bigger than 
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others. When n = 20, there are a few instances that their relative errors are significantly 
much bigger. This is probably due to the inexactness of their SOS relaxations. For n = 25, 
we have generated only one instance since its computation takes more than two days, and 
luckily a solution of high accuracy was found. But we should not expect this generally. 



n 


(N,m) 


^ inst 


time (min, med, max) 


err (min, med, max) 


20 


(210, 8854) 


20 


0:00:26 


0:00:31 


0:00:39 


(2e-9, 9e-9, 3e-8) 


30 


(465, 40919) 


20 


0:06:04 


0:06:38 


0:08:41 


(4e-9, 8e-9, 2e-8) 


40 


(820, 123409) 


10 


0:35:54 


0:36:03 


0:41:01 


(4e-10, 7e-9, 3e-8) 


50 


(1275, 292824) 


10 


2:29:23 


2:35:09 


3:58:45 


(5e-8, 7e-7, 3e-7) 


60 


(1830, 595664) 


3 


6:31:22 


9:02:36 


19:07:32 


(9e-9, 5e-8, le-8) 


70 


(2485, 1088429) 


3 


15:48:39 


24:57:13 


34:03:29 


(le-10, 6e-9, le-9) 


80 


(3240, 1837619) 


1 


99:26:20 


99:26:20 


99:26:20 


(7e-8, 2e-8, 9e-7) 


Table 7: Computational results for random (j4~T|) of degree 4 


n 


(N,m) 


# inst 


time (min, med, max) 


err (min, med, max) 


10 


(220, 5004) 


20 


0:00:21 


0:00:25 


0:00:30 


(le-9,4e-7,5e-5) 


15 


(680, 38759) 


20 


0:17:02 


0:18:24 


0:26:27 


(6e-9,9e-7,3e-4) 


20 


(1540, 177099) 


5 


5:49:08 


5:51:44 


6:12:59 


(5e-ll,5e-9,3.63) 


25 


(2925, 593774) 


1 


51:05:51 


51:05:51 


51:05:51 


(4e-9, 4e-9, 4e-9) 



Table 8: Computational results for random (|4.ip of degree 6 



n 


(N,m) 


# inst 


time (min, med, max) 


err (min, med, max) 


8 


(330, 6434) 


20 


0:01:56 


0:02:28 


0:03:11 


(4e-9,0.85, 2.08) 


10 


(715, 24309) 


20 


0:29:18 


0:33:55 


0:40:24 


(5e-ll,0.02,2.84) 


13 


(1820, 125969) 


3 


10:36:40 


16:35:53 


18:14:41 


(1.64, 1.91, 3.26) 


15 


(3060, 319769) 


1 


104:41:59 


104:41:59 


104:41:59 


(1.24, 1.24, 1.24) 



Table 9: Computational results for random (j4.1|) of degree 8 



When f{x) has degree 8 (d = 4), the computational results are in Table O For n = 8, 
solving (j4.7p - (|4.8p takes a few minutes; for n = 10, it takes about thirty to forty minutes; 
for n = 13, it takes about ten to eighteen hours; for n = 15, it takes about one hundred 
hours. The optimal Z* usually does not satisfy FEC, and the solutions obtained have very 
low accuracy. For a few instances, very accurate solutions are found. However, for most 
instances, the obtained solutions have big errors. This phenomena is probably due to the 
inexactness of their SOS relaxations. 

When f{x) has degree 10 {d = 5), the computational results are in Table [TOl For n = 8, 
solving (j4.7p - (j4.8p takes about thirty to ninety minutes; for n = 10, it takes about twenty five 
to thirty hours; for n = 11, it takes about one hundred hours. Similar to the case of degree 
8, the optimal Z* usually does not satisfy FEC. For most instances, the solutions obtained 
have big errors. This is probably because of the inexactness of their SOS relaxations. 

From the above, for randomly generated forms, we know SOS relaxation (j4.2p is very 
successful when deg(/) = 4 and also works well when deg(/) = 6. However, when deg(/) = 8 
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n 


(N,m) 


# inst 


time (mill, med, max) 


err (min, med, max) 


6 


(252, 3002) 


20 


0:00:43 


0:00:57 


0:01:16 


(3e-9,0.98, 2.24) 


8 


(792, 19447) 


10 


0:36:26 


0:59:51 


1:28:46 


(0.14, 1.67, 2.15) 


10 


(2002, 92377) 


3 


25:58:15 


29:30:35 


30:37:08 


(3e-3, 0.87, 2.51) 


11 


(3003, 184755) 


1 


110:48:52 


110:48:52 


110:48:52 


(1.27, 1.27, 1.27) 



Table 10: Computational results for random (|4.ip of degree 10 



or 10, SOS relaxation (j4.2p usually fails to find the true global minimum. This observation 
is consistent with the discovery of Blerkherman [1]. On a regular computer. Algorithm 12.41 
would solve big homogeneous polynomial optimization problems which are not solvable by 
prior existing SOS solvers. This is a big advantage of regularization type methods. 

4.2 Minimizing odd forms over unit spheres 

In some applications, we might need to optimize an odd form f{x) (deg(/) is odd) over S"""^. 
This problem is usually also very difficult. Nesterov [22] proved that it is already NP-hard 
when deg(/) = 3. For odd forms, SOS relaxation (j4.2p can not be applied directly. However, 
after a reformulation, this problem is equivalent to minimizing certain even forms. 

Let f{x) be an odd form of degree 2d — 1, and f^^^ be its minimum on S"~^. Let 
f{x,t) = f{x)t be a new even form in {x,t). Consider homogeneous optimization 



G ■■= „ ^ fix)t. (4.12) 

It was shown in [2^ that 

fhmg Jid - 1 {\ - — ^ f'""^ 
J min l|l Jmin- 

Since the form f{x,t) = f{x)t is even, SOS relaxation ()4.2p can be applied to get a lower 
bound /i'oT^ for /^™f . Then 

/ir^ := ^/2d^ (l - ^) '2 
is a lower bound for f^^- When a minimizer (x,t) is obtained for (j4.12p . the normalization 



-d 

chmg 

I SOS 



I mm 

TTl — l 



of x/i is then a minimizer of f{x) over 
Example 4.7. Minimize the following cubic form over 

f{x) = ^ ^ {XiXjXk + ^i^j ^i^k ~l~ -^j-^k)- 

l<i<j<k<n 

We first transform it to (I4.12p . and then solve its SOS relaxation by Algorithm 12. 4[ For 
n = 49, the computation takes about four hours, and the lower bound found is —124.9645 
(only four decimal digits are shown). The optimal Z* has rank one, and the extracted solution 
(only four decimal digits are shown) is 

-(0.0320, 0.0344, 0.0368, 0.0394, 0.0420, . . . , 0.2304, 0.2434, 0.2600, 0.2836, 0.3238). 

Its relative error is about 10~^. So it is a global minimizer up to a tiny numerical error. 
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5 Constrained polynomial optimization 

Consider general constrained polynomial optimization 

= ^=S (5.1) 
s.t. gi{x)>0,...,gi{x)>0, 

where /(x) and gi{x), . . . ,ge{x) are all multivariate polynomials and have degrees no greater 
than 2d. Problem (j5.ip is NP-hard, even when f{x) is quadratic and the feasible set is a 
simplex. Lasserre's relaxation is a typical approach for solving ()5.ip approximately. We refer 
to la mi EZl [M] for the work in this area. 

The d-th Lasserre's relaxation (d is also called the relaxation order) for (jS.ip is 

fsos ■■= max 7 

s.t. f{x) -J = ao{x) + gi{x)ai{x) -\ 1 ge{x)ai{x), 

ao{x),ai{x), . . . ,a£{x) are SOS, 
deg(o-o), deg(cJifi(i), . . . , deg{aege) < 2d. 

Let N{k) = ("+'^'), di = [deg(5i)/2] and goix) = 1. Then 7 is feasible for ([521) if and only if 
there exists G S^^'^''^'^ {i = 0,1, . . . J) such that 

fix) - 7 = E<7.(^)NJ_, = . {g^{x)[xU.,Ax]LciJ^ 

Define constant symmetric matrices Aa^ , Aa^ , • • • , Aa^ in the way that 

g^{x)[xU-dAx]d-d, = Yl ^a^^"' ^ = 0,1,...,! (5.3) 

QeN":|a|<2ci 

Denote A^ = . . . , ^if^), X = (X^, X^, . . . , X^), and define 

IC = X X • • • X 

Recall that U^^ = {a G : < \a\ < 2d}. If f{x) = fo + EaGU^^ then 7 is feasible 



for ()5.2p if and only if there exists X satisfying 

Ao»X + j = fo, 

X € JC. 

Now define A,b,C as 

^(X) = {Aa • ^)agUJ,, ^ = (/a)aGUJ,, C = Aq. (5.4) 

The vector b has dimension m = N{2d) — 1. Then SOS relaxation (15. 2p is equivalent to 

/,^^p":=min CX 

s.t. A{X) = b,XelC. ^ ^ 
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The dual of ([53]) is 




(5.6) 



Here the adjoint A*{y) has the form 




Theorem 5.1. For notations defined above, it holds f^,^ = -f^^^^ + fo < f^^n- V SOS 
relaxation \5.2\) is exact, then = —fgdp + fo- 

The proof is the same as for Theorem 13.11 and hence is omitted. A general approximation 
bound analysis for Lasserre's relaxation is given in |25j . 

Algorithm 12.41 can be applied to solve the conic SDP (|5.5p - (|5.6p . Unlike in Sections [3] 
and m there is usually no explicit formula for {AA*)~^ in Lasserre's relaxation, and it is 
typically quite difficult to compute {AA*)~^. So we just simply use (diag(^^*)) as a 
preconditioner. Suppose {X* ,y* , Z*) is optimal for (|5.5p - (|5.6|) . Then —fg^p + /o is a lower 
bound for the minimum f^Jl. The information for minimizers could be obtained from Z* . 
Note Z* = (Z*, Z*, . . . , Z;). Since Z* G /C, every Z* ^ 0. When Z* has rank one, f^^^ = /™," 
and a global minimizer for (15. ip can be extracted. Actually, if rank(ZQ) = 1, ()5.3p implies 
Zq = for some x* G M", and hence y* = -{x*)^ for every a G V^^t- Furthermore, 

the structures of Z* implies that Z* = gi{x*)[x*]d-di[x*]"^_^.- Hence, every gi{x*) > and x* 
is feasible for (jS.ip . Then, for every x £ feasible in (j5.ip . we have 



is feasible for ()5.6p . So x* is a global minimizer of (15. ip . 

When rank(ZQ) > 1, if FEC holds, several global minimizers could be extracted, e.g., by 
the method in [^. Typically, FEC fails if there are a lot of global minimizers or Lasserre's 
relaxation (j5.2p is not exact. If this happens, it is not clear how to get minimizers. 

5.1 Numerical experiments 

Now we present some numerical examples of applying Algorithm 12.41 to solve the conic SDP 
(|5.5p - (|5.6p which is equivalent to Lasserre's relaxation (j5.2p . As in Section \3A] the computa- 
tions are implemented in Matlab on the same machine. The parameters and stopping criteria 
of Algorithm 12.41 are also the same as there. The (diag(^^*)) is used as a preconditioner. 
If FEC holds, we use the method in [8] to get one or several global minimizers x*; otherwise, 
just set X* = Z*(2 : n + 1, 1). In either case, the relative error of x* is measured as 



Note we always have fg°g < f^^- If the obtained x* is feasible for (j5.ip . then err = if and 
only if X* is a global minimizer. 



-fix*) = -fo + b^y* >-fo+ Yl = -/(^) 



because y* is optimal and for every x feasible in (jS.ip 



Z = i[x]d[x]]i , gi{x)[x]d-dAx]li-d^, ■ ■ ■ , [x]d-dAx]d-d,) 



err = 



max{l,|/(x*)|}' 
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Example 5.2. Minimize the polynomial 

n n n n 

i=l l<«<i<" l<i<j<k<n l<i<j<k<l<n 

over the unit ball i?(0, 1) = {x G M" : ||x||2 < !}• It has degree 4 and is a summation of 
square free monomials. We apply the 2""' Lasserre's relaxation (|5.2|) to solve this problem. 
The resulting cone /C has 2 blocks. When n = 50, solving ()5.5p - (j5.6|) takes about 5 hours. The 
computed lower bound = —1.5240 (only four decimal digits are shown). The optimal 
Z* has rank 50 and FEC holds. There are 50 solutions extracted. One of them (only four 
decimal digits are shown) is 

(0.8852, -0.0665, -0.0665, -0.0665, -0.0665). 

It has unit norm, and hence is feasible. Its relative error is 7 • 10~^. So it is a global 
minimizer up to a tiny numerical error. Actually all its permutations are also solutions of 
the same accuracy, since the objective polynomial is symmetric in x. 

Example 5.3. Minimize the polynomial 

E2 2 ■ 3 -3 
X^ X I IX^ j I J'^i'^j 

l<i<j<n 

over the hypercube [— 1,1]"" = {x G M" : xf < 1}. It has degree 4. So we apply the 2"'^ 
Lasserre's relaxation ()5.2p to solve this problem. The resulting cone /C has n + 1 blocks. 
When n = 40, solving (|5.5p - (|5.6p takes about 22 hours, and the computed lower bound 
/so" — —2751.5 (only the first 5 digits are shown). The optimal Z* has rank two and FEC 
holds. The two extracted solutions (only 4 decimal digits are shown) are: 

±(1, 1, 1, 1, 1, 1, 1, 0.9434, 0.8920, 0.8483, 0.8105, 0.7775, 0.7482, 0.7220, 0.6985, 0.6771, 
0.6576, 0.6397, 0.6232, 0.6080, 0.5938, 0.5806, 0.5682, 0.5566, 0.5457, 0.5354, 0.5257, 

0. 5165.0.5077,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1). 

They are obviously feasible solutions. The relative error is around 6- 10"''. So they are global 
minimizers up to a small numerical error. 

Example 5.4. Minimize the polynomial 

EQ r> 2 2 

t"^ J I i I j i j 

l<i<j<n 

over the hypercube [—1, 1]" = {x G : xf < 1}. The degree is 4. So we apply the 2"'^ 
Lasserre's relaxation ()5.2p . The resulting cone /C has n + 1 blocks. For n = 50, solving (j5.5p - 
()5.6p takes about 18 hours. The computed lower bound = —3600 (rounded to integers). 
The optimal Z* has rank one, and the solution extracted is (rounded to integers) 

(-1,-1,1,-1,-1,1,1,-1,-1,1,1,-1-1,-1,-1,1,-1,1,1,-1,1,1,-1,-1, 

1, -1,-1,-1,1,-1,-1,-1,1,1,-1,-1,1,1,1,-1,-1,1,1,1,1,-1,-1,1,1,1). 

It is clearly a feasible solution. Its relative error is zero, so it is a global minimizer. 
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Example 5.5. Minimize the polynomial 

n 

i=l l<*<i<'^ 

over the unit ball .6(0, 1). It has degree 6. So we apply the 3'''^ Lasserre's relaxation ()5.2p . 
The resulting cone K, has 2 blocks. When n = 20, solving (j5.5|) - (j5.6p takes about 57 hours. 
The computed lower bound /™" = —20 (rounded to integers). The optimal Z* has rank one, 
and the solution extracted (rounded to integers) is 

(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1). 

It is feasible and has zero relative error. So it is a global minimizer. 

Example 5.6. Consider the polynomial optimization 

X] + + 0' + k)xfx^jxl 

l<i<j<k<n 

The constraint is the unit ball defined by the standard 4-norm. The degree is 6. We apply 
the 3'''^ Lasserre's relaxation (15. 2p to solve this problem. The resulting cone /C has 2 blocks. 
When n = 15, solving (I5.5p - ()5.6|) takes about 1.5 hours. The computed lower bound /™" = 
—575.5928 (only four decimal digits are shown). The optimal Z* has rank one, and the 
solution extracted (only four decimal digits are shown) is 

-(0.4034, 0.4274, 0.4486, 0.4674, 0.4839, 0.4983, 0.5107, 0.5211, 0.5296, 0.5363, 
0.5410, 0.5437, 0.5444, 0.5430, 0.5393). 

It has unit 4-norm, and hence is feasible. The relative error is around 4 • 10~'^. So it is a 
global minimizer up to a small numerical error. 

Example 5.7. Consider the polynomial optimization 

^ ^ XiXjXi^(^l + Xj -|- Xj + x/j) -|- ix^ -\- jx^ + kx^ 

l<i<j<k<n 

X-y ~\~ ' ' ' ~\~ Xn ^ 1, Xn_j_-j^ -j- * * * -j- X^ ^ 1 

where n is an even integer. The degree is 6, and the constraint is a bi-unit ball under the 
4-norm. We apply the 3'''^ Lasserre's relaxation (j5.2p to solve this problem. The resulting 
cone K, has 3 blocks. For n = 16, solving (j5.5p - (j5.6|) takes about 19 hours, and the computed 
lower bound fg°^ = —1.1078 (only four decimal digits are shown). The optimal Z* has rank 
one, and the extracted solution (only four decimal digits are shown) is 

-(0.2418, 0.2208, 0.2085, 0.2000, 0.1934, 0.1882, 0.1838, 0.1800, 0.1767, 0.1738, 
0.1712, 0.1688, 0.1667, 0.1647, 0.1629, 0.1612). 

It is a feasible solution, and the relative error is around 4 • 10~^. So it is a global minimizer 
up to a tiny numerical error. 



min 
s.t. 



min 

S.t. 



25 



Example 5.8. Consider the polynomial optimization 

mm ^ iXiXjXk+jx!i^iX^+jXii+k + kxiXjXkX'^+iXn^jXn+k 

l<i<j<k<n/2 

S.t. X-^ ~1~ * * * ~1~ Xn ^ 1, Xn_^-^ ~\~ ' ' ' ~\~ X^ ^ 1 

where n is even. Since the degree is 6, we apply the S*""^ Lasserre's relaxation (|5.2|) . The cone 
IC has 3 blocks. For n = 20, solving (j5.5p - (j5.6p takes about 32 hours. The computed lower 
bound = —153.6180 (only four decimal digits are shown). The optimal Z* has rank one, 
and the solution extracted (only four decimal digits are shown) is 

-(-0.3642, 0.3955, 0.5042, 0.5589, 0.5892, 0.6049, 0.6109, 0.6104, 0.6057, 0.5991, 
0.5828, 0.5173, 0.5193, 0.5306, 0.5459, 0.5619, 0.5763, 0.5869, 0.5919, 0.5896). 

The first and second subvector both have unit 4-norm, so it is a feasible solution. Its relative 
error is around 5 • 10~^. So it is a global minimizer up to a tiny numerical error. 



5.2 Sparse SOS relaxation for constrained optimization 

In many applications, polynomials in optimization are often sparse. It is important to exploit 
sparsity for computational efficiency. We refer to [TOl fT2| fTT| [T^ [26| [29| [38] . 

Suppose each polynomial gj only involves a subvector xj^ = {xi : i £ Ij} for an index set 
/j C {1, 2, . . . , n}. Consider the sparse polynomial optimization 

min fix) 

xeM- (5.7) 
s.t. gi{xi^)>0,...,gi{xi^)>0. 

The sparse generalized Lagrangian function (see Waki et al. |38j ) for the above is 

e 

L{x,(l)) = f{x) - '^(j)j{xi.)gj{xj^), 

where (j)j{xi.) are SOS polynomials in xj.. Let {Ci, . . . ,Cr} be the set of maximal cliques 
of a chordal extension of correlative sparsity pattern (csp) matrix of L{x, (p) (see Waki et al. 
[38]). A sparse Lasserre's relaxation of order d proposed in [38] for (|5.7p is 

max 7 

r I 
s.t. f{x) - 7 = Y^diixcJ + Y^<l)j{xi^)gj{xi^), 
i=i j=i 

deg((Ti),deg(</)jfi(j) < 2d, ai,(pj are SOS, 

i = 1,. . . ,r,j = 1,. . . 



Like the general dense case, (|5.8p can also be formulated in the form of conic semidefinite 
optimization (12. ip . For convenience, we still use the same notations as before. Note 7 is 
feasible for (j5.8p if and only if there exists X^^^ {i = 1, . . . ,r + £) satisfying 

/(x)-7= E[xcJlX«[xcJd+ E5,(^)[^/JI_,,^('^+^'n^/.]d-d,, 
1=1 j=i 

^ 0, . . . , h 0, x^'^+i) ^ 0, . . . , h 0. 
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Let W be the set of exponents of nonconstant monomials appearing in (j5.8p . For a S W, 

(1) (r-\-i) 

define constant symmetric matrices Aa , ■ ■ ■ , Aa in the way that 



Qe{o}uw 

aG{0}UW 



It is obvious that 



The dual of the above is 



min C • X 

s.t. A{X) = b,Xe}C. 

max b'^y 
s.t. A*{y) + Z = C, Z elC. 



The adjoint A*{y) now takes the form 



(5.9) 



Denote = . . . , ^i'^^^^), X = {XW ^ . and define 

(\Ci\+d\ (\Cr\+d\ (\Il\+<i-di\ 

If /(x) = /o + EaeW /a^"' constraint of (j5.8p is equivalent to 

+ 7 = /o, 

= fa VaGW, 

X E /C. 

Similar to (j5.4p . define 6, C as 

^(X) = (A« • X)«gW, &=(/a)aeW, C = Ao. (5.10) 
The dimension of 6 is m = |W|, the cardinality of W. So (15. 8p is equivalent to 



(5.11) 



(5.12) 



VaGW qGW aGW / 

The information about minimizers would be obtained from Z* . We refer to Laurent and 
Mourrain |18j . 

Example 5.9. Consider the following sparse polynomial optimization 

min E (E(^. + ^f)- E ^l^i] 

S.t. 1 - llxjJII > 0,A: = 1,... ,34. 
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Here the index set Jk = {j '■ max(l,3fc — 3) < j < miii(n, 3/c + 2)}, and n = 100 is the 
total number of variables. The degree is 6. We apply the sparse Lasserre's relaxation (I5.8p 
of order 3. Since the csp matrix here is banded, its csp graph is chordal. So the set of 
maximal cliques is {Ji, J2, . . . , J33} (note J34 C J33). We apply Algorithm 12.41 to solve the 
resulting conic SDP (j5.1ip - ()5.12p . It takes about 15 minutes, and the computed lower bound 
is —194.8056 (only four decimal digits are shown). To get a minimizer, we just simply set 
X* = y*. for i = 1, . . . , n, and get the solution (only four decimal digits are shown) 

- (0.7066 * 2, ( 0.6394 * 3, 0.6385 * 3 ) * 16, 0.7076 * 2) . 

In the above, the product a * t means a is repeated for t times in a row. The above solution 
has relative error around 8 • 10~^, so it is a global minimizer up to a tiny numerical error. 

6 Conclusions and some discussions 

This paper proposes regularization type methods for solving SOS and Lasserre's relaxations 
in large scale polynomial optimization. After a description of these methods for general conic 
semidefinite optimization, we show how to apply them in solving unconstrained, homoge- 
neous and constrained polynomial optimization. As demonstrated by extensive numerical 
experiments, significantly larger problems would be solved efficiently by them. For a quartic 
polynomial optimization problem with 100 variables, its SOS relaxation would be solved on 
a regular computer, which is almost impossible by applying prior existing SOS solvers. 

The proposed regularization methods for solving large scale polynomial optimization are 
still in their very early stages. Much work need to be done for improving their practical 
performance. From our experience, the following issues are important. 

• The performance of Algorithm 12 .41 depends very much on CG iterations for solving linear 
system (12. 9p . In the inner loop, if (12. 9p is solved efficiently by CG, then Algorithm 12.41 
usually converges fast; otherwise, a lot of time would be spent on solving (j2.9p . which 
finally makes Algorithm 12.41 run very slow. So the conditioning of ()2.9p seriously affects 
the performance. Generally, the accuracy parameters in Algorithm 12.41 can not be set 
too small. 

• For unconstrained and homogeneous polynomial optimization, there are explicit formu- 
lae for the inverse (^.4*)"^ which is used as a preconditioner for solving ()2.9p by CG. For 
constrained polynomial optimization, there is no explicit formula for {AA*)^'^, and we 
just use (diag(^.A*)) as a preconditioner. Generally, is there a better preconditioner 
in polynomial optimization? This is not clear for us. 

• It is critical to update at in Algorithm 12.41 From our numerical experience, it is better 
to choose relatively bigger p if (j2.9p is solved accurately by CG; otherwise, we should 
choose relatively small p. Generally, what is the best strategy for updating Cfc? 

• Algorithms 12.21 and 12.41 are basically special cases of Algorithm l2.3i Is there a more effi- 
cient specification of Algorithm 12.31 for solving large scale SOS or Lasserre's relaxations? 
Or does polynomial optimization have any special features in applying Algorithm 12.31 
or its specifications efficiently? This is not clear for us. 
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The regularization methods in Section [3] are designed for solving general big conic semidef- 
inite optimization (12. ip . A big advantage of them is the low memory requirement for storing 
matrices and vectors during the computation. Generally, the computer memory could be 
used to a big extent for storing problem data and the computation does not require much 
more extra memory. Therefore, if a big SDP fits the computer memory, we can always run 
the algorithm while possibly waiting for a long time. 

Acknowledgement The author would like very much to thank Bill Helton and Igor Klep 
for fruitful discussions on this work. 
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